About Normalized Google distance



Normalized Google distance: English Wikipedia edition

Normalized Google distance
The normalized Google distance (NGD) is a semantic similarity measure derived from the number of hits returned by the Google search engine for a given set of keywords. Keywords with the same or similar meanings in a natural-language sense tend to be "close" in units of normalized Google distance, while words with dissimilar meanings tend to be farther apart.
Specifically, the normalized Google distance between two search terms ''x'' and ''y'' is

\operatorname{NGD}(x,y) = \frac{\max\{\log f(x), \log f(y)\} - \log f(x,y)}{\log M - \min\{\log f(x), \log f(y)\}}

where ''M'' is the total number of web pages searched by Google; ''f''(''x'') and ''f''(''y'') are the number of hits for search terms ''x'' and ''y'', respectively; and ''f''(''x'', ''y'') is the number of web pages on which both ''x'' and ''y'' occur.
If the two search terms ''x'' and ''y'' never occur together on the same web page, but do occur separately, the normalized Google distance between them is infinite. If both terms always occur together, their NGD is zero.
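
The definition and its two limiting cases can be checked with a minimal Python sketch. The function name `ngd`, its parameter names, and the hit counts in the usage lines are illustrative assumptions, not part of the original article. The counts are passed in directly rather than fetched from a search engine, and the natural logarithm is used, which does not affect the result because the base cancels in the ratio.

```python
import math


def ngd(fx: float, fy: float, fxy: float, m: float) -> float:
    """Normalized Google distance computed from hit counts.

    fx, fy -- number of hits for terms x and y (hypothetical inputs)
    fxy    -- number of pages on which both x and y occur
    m      -- total number of pages searched
    """
    if fx <= 0 or fy <= 0:
        raise ValueError("each term must occur on at least one page")
    if m <= min(fx, fy):
        raise ValueError("m must exceed the smaller of the two hit counts")
    if fxy <= 0:
        # The terms never co-occur: log f(x, y) diverges, so the distance is infinite.
        return math.inf
    log_fx, log_fy = math.log(fx), math.log(fy)
    numerator = max(log_fx, log_fy) - math.log(fxy)
    denominator = math.log(m) - min(log_fx, log_fy)
    return numerator / denominator


# Made-up counts, chosen so the arithmetic is easy to follow by hand:
# (log 1000 - log 10) / (log 10^6 - log 100) = 2/4 in base 10.
print(ngd(1000, 100, 10, 1_000_000))   # approximately 0.5
print(ngd(1000, 100, 0, 1_000_000))    # inf: terms never occur together
print(ngd(500, 500, 500, 1_000_000))   # 0.0: terms always occur together
```

The guard on `m` is a choice made for this sketch: it rejects inputs for which the denominator would be zero or negative, a case the article's definition does not address.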
The normalized Google distance is derived from the earlier normalized compression distance (Cilibrasi & Vitanyi 2003). A closely related algorithm was described by Allen and Wu (2002).
==References==

* R. Allen and Y. Wu (2002). Generality of Texts. ICADL, Singapore, December, 111–116.
* R. Allen and Y. Wu (2005). Metrics for the Scope of a Collection. JASIST, 55(10), 1243–1249.
* R.L. Cilibrasi and P.M.B. Vitanyi (2004/2007). The Google similarity distance. ArXiv.org (2004); IEEE Trans. Knowledge and Data Engineering, 19:3 (2007), 370–383.
* R.L. Cilibrasi and P.M.B. Vitanyi (2003/2005). Clustering by Compression. ArXiv.org (2003); IEEE Trans. Information Theory, 51:4 (2005), 1523–1545.
* Google's search for meaning, at Newscientist.com. http://www.newscientist.com/article.ns?id=dn6924
* J. Poland and Th. Zeugmann (2006). Clustering the Google Distance with Eigenvectors and Semidefinite Programming.
* A. Gupta and T. Oates (2007). Using Ontologies and the Web to Learn Lexical Semantics. (Includes comparison of NGD to other algorithms.)
* W. Wong, W. Liu and M. Bennamoun (2007). Tree-Traversing Ant Algorithm for Term Clustering based on Featureless Similarities. Data Mining and Knowledge Discovery, 15(3), 349–381. (Uses NGD for term clustering.)

Excerpt source: the free encyclopedia Wikipedia